DETAILED ACTION

This action is responsive to application 10/037,040 filed 12/21/2001. 
Claims 1-25 have been examined. 

Drawings 

The drawings have not been checked to the extent necessary to determine the 
presence of all possible minor errors. Applicant's cooperation is required in correcting
any errors of which applicant may become aware in the drawings. 

The drawings are objected to because: 
• Fig. 4, item 74 would read well labeled PROCESSING ENGINE as suggested on
page 8, line 27

A proposed drawing correction or corrected drawings are required in reply to the Office 
action to avoid abandonment of the application. The objection to the drawings will not 
be held in abeyance. 

Specification 

The specification has not been checked to the extent necessary to determine the 
presence of all possible minor errors. Applicant's cooperation is required in correcting 
any errors of which applicant may become aware in the specification. 

The disclosure is objected to because of the following informalities: 
- The discussion of Fig. 1 in the Background of the Invention section page 3, line
17 through page 5, line 13 would read well placed in the Detailed Description of 
the Invention section beginning on page 7 

- '60' on page 10, lines 9-11 would read well as '80'

- '64' on page 10, line 11 would read well as '84'
Appropriate correction is required. 

Claim Objections 

Claims 7 and 23 are objected to because of the following informalities:
Regarding claim 7: 

- 'wherein a' on page 13, line 6 would read well as 'wherein the decision tree structure 
comprises a' 

- 'of the plurality' on page 13, line 7 would read well as 'of a plurality' 
Regarding claim 23: 

- 'method' on page 15, line 19 would read well as 'apparatus' 
Appropriate correction is required. 

Claim Rejections - 35 USC § 101 

35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 
Claims 1-3, 5-19 and 21-23 are rejected under 35 U.S.C. 101 because the
claimed invention is directed to non-statutory subject matter. The language of the
claims (e.g. "search object", "entry", "knowledge base") raises a question as to whether
the claims are directed merely to an abstract idea that is not tied to a technological art,
environment or machine which would result in a practical application producing a
concrete, useful, and tangible result to form the basis of statutory subject matter under
35 U.S.C. 101. For example, if the independent claims were amended to recite a
computer-implemented method or apparatus and required performance of a result
outside of a computer, they would be statutory in most cases since use of technology permits
the function of the descriptive material to be realized.

Claim Rejections - 35 USC § 103 

To expedite a complete examination of the instant application, the claims 
rejected under 35 U.S.C. 101 (nonstatutory) above are further rejected as set forth
below in anticipation of applicant amending these claims to place them within the four 
statutory categories of invention. 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made.



Application/Control Number: 10/037,040 Page 5

Art Unit: 2121

This application currently names joint inventors. In considering patentability of the 
claims under 35 U.S.C. 103(a), the Office presumes that the subject matter of the 
various claims was commonly owned at the time any inventions covered therein were 
made absent any evidence to the contrary. Applicant is advised of the obligation under 
37 CFR 1.56 to point out the inventor and invention dates of each claim that was not
commonly owned at the time a later invention was made in order for the Office to
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g)
prior art under 35 U.S.C. 103(a).

Claims 1-3, 6, 13, 15, 18-19 and 22 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Bennett United States Patent Number (USPN) 5,813,001 "Method for 
performing optimized intelligent searches of knowledge bases using submaps 
associated with search objects" (Sep. 22, 1998) in view of Bialkowski et al USPN
5,463,777 "System for segmenting data packets to form binary decision trees which
determine filter masks combined to filter the packets for forwarding" (Oct. 31, 1995).
Regarding claim 1: 
Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge 
base (Abstract), wherein the knowledge base comprises a plurality of search nodes, 
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said 
method comprising 

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first 
memory

- comparing the first search node with at least a portion of the search object (column 5,
lines 23-45)

- based on the comparing step, traversing a search path (column 8, lines 59-61) from 
the first search node to a second search node via the joining link 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision 
tree structure while Bialkowski et al teaches, 

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a
decision tree structure comprising a plurality of search nodes, and a plurality of links
joining two search nodes (column 3, lines 40-52), said method comprising

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Motivation - The portions of the claimed method would have been a highly desirable 
feature in this art for maintaining memory requirements and other hardware needs at a 
minimum (Bialkowski et al, column 1, lines 32-39). Therefore, it would have been
obvious to one of ordinary skill in the art at the time the invention was made, to modify 
Bennett as taught by Bialkowski et al for the purpose of maintaining memory 
requirements. 
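
For illustration only, the steps mapped above for claim 1 (reading a search node, comparing it with a portion of the search object, and traversing a joining link, with the decision tree held in two memories) can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not taken from Bennett, Bialkowski et al, or the application; the names SearchNode, first_memory, second_memory, read_node and matches, and the sample entries "b" and "acd", are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SearchNode:
    links: Dict[str, int]      # symbol -> id of the linked search node
    is_entry: bool = False     # True if a knowledge-base entry ends here


# Hypothetical split: the first portion of the tree sits in one store,
# the second portion in another. The stored entries are "b" and "acd".
first_memory: Dict[int, SearchNode] = {
    0: SearchNode({"a": 1, "b": 2}),
    1: SearchNode({"c": 3}),
    2: SearchNode({}, is_entry=True),
}
second_memory: Dict[int, SearchNode] = {
    3: SearchNode({"d": 4}),
    4: SearchNode({}, is_entry=True),
}


def read_node(node_id: int) -> SearchNode:
    """Read a search node from whichever memory holds it."""
    if node_id in first_memory:
        return first_memory[node_id]
    return second_memory[node_id]


def matches(search_object: str) -> bool:
    """Return True if the search object matches a stored entry."""
    node_id = 0
    for part in search_object:
        node = read_node(node_id)          # reading step
        next_id = node.links.get(part)     # comparing step
        if next_id is None:
            return False
        node_id = next_id                  # traversing step via the link
    return read_node(node_id).is_entry
```

In this sketch, a search for "acd" starts in the first store and finishes in the second, which is the only point the split is meant to show.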
Regarding claim 2:

The rejection of claim 2 is similar to that for claim 1 as recited above since the stated
limitations of the claim are set forth in the references. The additional limitation of claim 2 is
taught in Bennett:

- reading the second search node and comparing at least a portion of the search object 
with the second search node (column 5, lines 7-45) 

Regarding claim 3: 

The rejection of claim 3 is similar to that for claim 1 as recited above since the stated
limitations of the claim are set forth in the references. The additional limitation of claim 3 is
taught in Bennett:

- the steps of reading, comparing and traversing are repeated until the second portion of
the decision tree is traversed to an end thereof (column 16, lines 35-53)

Regarding claim 6: 

The rejection of claim 6 is the same as that for claim 1 as recited above since the stated 
limitations of the claim are set forth in the references. 
Regarding claim 13: 

The rejection of claim 13 is the same as that for claims 1 and 3 as recited above since 
the stated limitations of the claim are set forth in the references. 
Regarding claim 15: 

The rejection of claim 15 is similar to that for claim 1 as recited above since the stated
limitations of the claim are set forth in the references. The additional limitation of claim 15
is taught in Bialkowski et al:

- the decision tree structure comprises a plurality of contiguous (column 1, line 67;
column 2, lines 1-8) tree levels, wherein each tree level further comprises a search
node and link to a search node of the next adjacent tree level

Regarding claim 18: 
Bennett teaches, 

- An apparatus for determining whether a search object matches any entry in a 
knowledge base (Abstract), wherein the knowledge base comprises a plurality of links 
between adjacent search nodes (Fig. 2; column 6, lines 45-63) 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision 
tree structure while Bialkowski et al teaches, 

- An apparatus (Abstract; column 1, lines 62-67) wherein the knowledge base 
comprises a decision tree structure comprising a plurality of links between adjacent 
search nodes (column 3, lines 40-52), said apparatus comprising 

- a first memory storing a first portion of the decision tree structure (Fig. 1; column 2,
lines 9-22)

- a second memory storing a second portion of the decision tree structure (Fig. 1;
column 6, lines 22-41)

- a processor (column 1, lines 22-28) for matching at least a portion of the search object
with a search node and for traversing through the decision tree structure in response to 
the match 

Motivation - The portions of the claimed apparatus would have been a highly desirable 
feature in this art for maintaining memory requirements and other hardware needs at a 
minimum (Bialkowski et al, column 1, lines 32-39). Therefore, it would have been
obvious to one of ordinary skill in the art at the time the invention was made, to modify
Bennett as taught by Bialkowski et al for the purpose of maintaining memory
requirements.
Regarding claim 19: 

The rejection of claim 19 is the same as that for claims 18 and 1 as recited above since 
the stated limitations of the claim are set forth in the references. 
Regarding claim 22: 

The rejection of claim 22 is the same as that for claims 18 and 1 as recited above since 
the stated limitations of the claim are set forth in the references. 

Claims 4 and 20 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Bennett in view of Bialkowski et al and in further view of Pollack et al USPN 6,571,238
"System for regulating flow of information to user by using time dependent function to
adjust relevancy threshold" (Filed Jun. 11, 1999). 
Regarding claim 4: 
Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge
base (Abstract), wherein the knowledge base comprises a plurality of search nodes, 
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said 
method comprising

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first
memory

- comparing the first search node with at least a portion of the search object (column 5, 
lines 23-45) 

- based on the comparing step, traversing a search path (column 8, lines 59-61) from 
the first search node to a second search node via the joining link 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision 
tree structure or the step of reading is executed by a processor formed in an integrated 
circuit, and wherein the first memory is formed on the integrated circuit, such that the 
step of reading search nodes from the first memory executes faster than the step of 
reading search nodes from the second memory while Bialkowski et al teaches, 

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a 
decision tree structure comprising a plurality of search nodes, and a plurality of links 
joining two search nodes (column 3, lines 40-52), said method comprising 

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Pollack et al teaches, 

- the step of reading is executed by a processor formed in an integrated circuit, and
wherein the first memory is formed on the integrated circuit, such that the step of
reading search nodes from the first memory executes faster than the step of reading
search nodes from the second memory (column 10, lines 54-67; column 11, lines 1-3)

Motivation - The portions of the claimed method would have been a highly desirable
feature in this art for maintaining memory requirements and other hardware needs at a
minimum (Bialkowski et al, column 1, lines 32-39) and managing data movement
(Pollack et al, column 11, lines 6-10). Therefore, it would have been obvious to one of
ordinary skill in the art at the time the invention was made, to modify Bennett as taught
by Bialkowski et al and Pollack et al for the purpose of maintaining memory
requirements and managing data movement.
Regarding claim 20: 

The rejection of claim 20 is the same as that for claims 18 and 4 as recited above since 
the stated limitations of the claim are set forth in the references. 

Claims 5 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Bennett in view of Bialkowski et al and in further view of Nakano et al USPN 6,636,802 
"Data structure of digital map file" (PCT Filed Nov. 24, 1999). 
Regarding claim 5: 
Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge
base (Abstract), wherein the knowledge base comprises a plurality of search nodes,
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said
method comprising

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first 
memory 

- comparing the first search node with at least a portion of the search object (column 5, 
lines 23-45) 

- based on the comparing step, traversing a search path (column 8, lines 59-61) from
the first search node to a second search node via the joining link 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision 
tree structure or the first portion of the decision tree structure comprises the search 
nodes near the first search entry while Bialkowski et al teaches, 

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a
decision tree structure comprising a plurality of search nodes, and a plurality of links
joining two search nodes (column 3, lines 40-52), said method comprising

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Nakano et al teaches,

- the first portion of the decision tree structure comprises the search nodes near the first
search entry (column 41, lines 45-60)

Motivation - The portions of the claimed method would have been a highly desirable 
feature in this art for maintaining memory requirements and other hardware needs at a 
minimum (Bialkowski et al, column 1, lines 32-39) and speeding up the entry node
search (Nakano et al, column 41, lines 60-62). Therefore, it would have been obvious
to one of ordinary skill in the art at the time the invention was made, to modify Bennett 
as taught by Bialkowski et al and Nakano et al for the purpose of maintaining memory 
requirements and speeding up the entry node search. 
Regarding claim 17: 

The rejection of claim 17 is the same as that for claims 15 and 5 as recited herein since 
the stated limitations of the claim are set forth in the references.

Claims 7-9, 16 and 21 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Bennett in view of Bialkowski et al and in further view of Vahalia et al 
USPN 6,625,591 "Very efficient in-memory representation of large file system 
directories" (Filed Sep. 29, 2000). 
Regarding claim 7: 
Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge
base (Abstract), wherein the knowledge base comprises a plurality of search nodes,
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said
method comprising

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first 
memory 

- comparing the first search node with at least a portion of the search object (column 5, 
lines 23-45) 

- based on the comparing step, traversing a search path (column 8, lines 59-61) from 
the first search node to a second search node via the joining link 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision
tree structure or a predetermined number of lower levels of the plurality of levels are 
stored in the first memory, and wherein the remaining plurality of levels are stored in the 
second memory while Bialkowski et al teaches, 

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a
decision tree structure comprising a plurality of search nodes, and a plurality of links
joining two search nodes (column 3, lines 40-52), said method comprising

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Vahalia et al teaches,

- a predetermined number of lower levels of the plurality of levels are stored in the first
memory, and wherein the remaining plurality of levels are stored in the second memory
(column 13, line 67; column 14, lines 1-23)

Motivation - The portions of the claimed method would have been a highly desirable 
feature in this art for maintaining memory requirements and other hardware needs at a 
minimum (Bialkowski et al, column 1, lines 32-39) and managing data movement
(Pollack et al, column 11, lines 6-10) and accelerating a search (Vahalia et al, column 2,
lines 47-49). Therefore, it would have been obvious to one of ordinary skill in the art at 
the time the invention was made, to modify Bennett as taught by Bialkowski et al and 
Vahalia et al for the purpose of maintaining memory requirements and accelerating a 
search. 
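
As an illustration of the level-based storage discussed for claim 7, the following minimal Python sketch places a predetermined number of tree levels in one store and the remaining levels in another. It is not drawn from Vahalia et al, Bialkowski et al, or the application. The function name split_by_level, the sample tree, and the reading of "lower levels" as the levels nearest the root are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Store = Dict[str, List[str]]   # node name -> names of its child nodes


def split_by_level(children: Store, root: str,
                   levels_in_first: int) -> Tuple[Store, Store]:
    """Place the first `levels_in_first` levels (counted from the root,
    an assumption) in the first store and all deeper levels in the second."""
    first: Store = {}
    second: Store = {}
    frontier = [root]
    depth = 0
    while frontier:
        target = first if depth < levels_in_first else second
        next_frontier: List[str] = []
        for name in frontier:
            kids = children.get(name, [])
            target[name] = kids            # store the node and its links
            next_frontier.extend(kids)
        frontier = next_frontier           # move down one tree level
        depth += 1
    return first, second


# Hypothetical three-level tree.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
first_store, second_store = split_by_level(tree, "root", 2)
```

With two levels assigned to the first store, "root", "a" and "b" land there and the three leaf nodes land in the second store.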

Regarding claim 8: 

The rejection of claim 8 is the same as that for claim 7 as recited above since the stated 
limitations of the claim are set forth in the references. 
Regarding claim 9: 

The rejection of claim 9 is the same as that for claims 1 and 7 as recited above since 
the stated limitations of the claim are set forth in the references. 
Regarding claim 16: 

The rejection of claim 16 is the same as that for claims 15 and 7 as recited above since 
the stated limitations of the claim are set forth in the references. 
Regarding claim 21:

The rejection of claim 21 is the same as that for claims 18 and 7 as recited above since
the stated limitations of the claim are set forth in the references. 

Claims 10-11 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Bennett in view of Bialkowski et al and in further view of Friedberg USPN 6,662,184
"Very efficient in-memory representation of large file system directories" (Filed Sep. 22, 
2000). 

Regarding claim 10: 

Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge
base (Abstract), wherein the knowledge base comprises a plurality of search nodes, 
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said 
method comprising 

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first 
memory 

- comparing the first search node with at least a portion of the search object (column 5, 
lines 23-45) 

- based on the comparing step, traversing a search path (column 8, lines 59-61) from 
the first search node to a second search node via the joining link

However, Bennett doesn't explicitly teach the knowledge base comprises a decision
tree structure or the search object comprises a plurality of symbols while Bialkowski et 
al teaches, 

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a 
decision tree structure comprising a plurality of search nodes, and a plurality of links 
joining two search nodes (column 3, lines 40-52), said method comprising 

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Friedberg teaches, 

- the search object comprises a plurality of symbols (column 10, lines 66-67; column 11,
lines 1-10)

Motivation - The portions of the claimed method would have been a highly desirable 
feature in this art for maintaining memory requirements and other hardware needs at a 
minimum (Bialkowski et al, column 1, lines 32-39) and searching and retrieving data
(Friedberg, Abstract). Therefore, it would have been obvious to one of ordinary skill in 
the art at the time the invention was made, to modify Bennett as taught by Bialkowski et 
al and Friedberg for the purpose of maintaining memory requirements and 
searching/retrieving data. 
Regarding claim 11:

The rejection of claim 11 is the same as that for claim 10 as recited above since the
stated limitations of the claim are set forth in the references.

Claim 12 is rejected under 35 U.S.C. 103(a) as being unpatentable over Bennett 
in view of Bialkowski et al and in further view of Corl et al USPN 6,772,223 
"Configurable classification interface for networking devices supporting multiple action 
packet handling rules" (Filed Apr. 10, 2000). 
Regarding claim 12: 
Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge
base (Abstract), wherein the knowledge base comprises a plurality of search nodes, 
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said 
method comprising 

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first 
memory 

- comparing the first search node with at least a portion of the search object (column 5, 
lines 23-45) 

- based on the comparing step, traversing a search path (column 8, lines 59-61) from
the first search node to a second search node via the joining link 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision
tree structure or the knowledge base comprises a classification engine of a
communications network processor for determining an attribute of the data input
thereto, and wherein the second portion of the decision tree ends in a plurality of
terminating nodes, the method further comprising repeating the steps of reading,
comparing and traversing until a terminating node is reached, wherein the terminating
node identifies the attribute of the input data while Bialkowski et al teaches,

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a
decision tree structure comprising a plurality of search nodes, and a plurality of links
joining two search nodes (column 3, lines 40-52), said method comprising

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Corl et al teaches, 

- the knowledge base comprises a classification engine of a communications network 
processor for determining an attribute of the data input thereto (column 2, lines 11-27),
and wherein the second portion of the decision tree ends in a plurality of terminating
nodes, the method further comprising repeating the steps of reading, comparing and
traversing until a terminating node is reached, wherein the terminating node identifies
the attribute of the input data (column 2, lines 58-67; column 3, lines 1-28)

Motivation - The portions of the claimed method would have been a highly desirable
feature in this art for maintaining memory requirements and other hardware needs at a
minimum (Bialkowski et al, column 1, lines 32-39) and defining the types of actions that
are to be applied to packets processed by a network processor device (Corl et al, 
Abstract). Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made, to modify Bennett as taught by Bialkowski et al and Corl 
et al for the purpose of maintaining memory requirements and defining network 
processor packet action types. 
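
The claim 12 limitation discussed above (repeating the reading, comparing and traversing steps until a terminating node identifies an attribute of the input data) can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not the classification engine of Corl et al or of the application. The Decision and Terminal classes, the TREE contents, the packet field names and the attribute strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Decision:
    key: str                    # which field of the input data to inspect
    links: Dict[object, str]    # field value -> name of the next node
    default: str                # node to take when no link matches


@dataclass
class Terminal:
    attribute: str              # attribute identified for the input data


# Hypothetical decision tree ending in terminating nodes.
TREE: Dict[str, Union[Decision, Terminal]] = {
    "start": Decision("protocol", {"tcp": "tcp_port"}, "best_effort"),
    "tcp_port": Decision("dst_port", {80: "web", 22: "admin"}, "best_effort"),
    "web": Terminal("web"),
    "admin": Terminal("admin"),
    "best_effort": Terminal("best-effort"),
}


def classify(packet: dict) -> str:
    """Traverse the tree until a terminating node is reached."""
    node = TREE["start"]
    while isinstance(node, Decision):
        value = packet.get(node.key)                        # read and compare
        node = TREE[node.links.get(value, node.default)]    # traverse
    return node.attribute
```

For example, classify({"protocol": "tcp", "dst_port": 80}) ends at the terminating node whose attribute is "web", while an input with no matching link falls through to the default terminating node.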

Claims 14 and 23-25 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Bennett in view of Bialkowski et al and in further view of Benayoun et al USPN 
6,516,319 B1 "Parallelized processing device for processing search keys based upon 
tree structure" (Filed May 11, 2000). 
Regarding claim 14: 
Bennett teaches, 

- A method for determining whether a search object matches an entry in a knowledge
base (Abstract), wherein the knowledge base comprises a plurality of search nodes, 
and a plurality of links joining two search nodes (Fig. 2; column 6, lines 45-63), said 
method comprising 

- reading (column 4, lines 66-67; column 5, lines 1-22) a first search node from the first 
memory

- comparing the first search node with at least a portion of the search object (column 5,
lines 23-45)

- based on the comparing step, traversing a search path (column 8, lines 59-61) from 
the first search node to a second search node via the joining link 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision 
tree structure or each one of the plurality of search nodes comprises an instruction and 
an address field, wherein the step of comparing further comprises comparing at least a 
portion of the search object with the instruction, and wherein the address field 
determines the second search node based on the comparing step while Bialkowski et al
teaches, 

- A method (Abstract; column 1, lines 62-67), wherein the knowledge base comprises a
decision tree structure comprising a plurality of search nodes, and a plurality of links
joining two search nodes (column 3, lines 40-52), said method comprising

- storing a first portion of the decision tree structure in a first memory (Fig. 1; column 2,
lines 9-22), wherein the first portion comprises a first plurality of search nodes and
interconnecting links

- storing a second portion of the decision tree structure in a second memory (Fig. 1;
column 6, lines 22-41), wherein the second portion comprises a second plurality of
search nodes and interconnecting links

Benayoun et al teaches, 

- each one of the plurality of search nodes comprises an instruction and an address
field, wherein the step of comparing further comprises comparing at least a portion of
the search object with the instruction, and wherein the address field determines the
second search node based on the comparing step (column 8, lines 15-21)

Motivation - The portions of the claimed method would have been a highly desirable
feature in this art for maintaining memory requirements and other hardware needs at a
minimum (Bialkowski et al, column 1, lines 32-39) and searching for the tree leaf
matching a search key (Benayoun et al, Abstract). Therefore, it would have been
obvious to one of ordinary skill in the art at the time the invention was made, to modify 
Bennett as taught by Bialkowski et al and Benayoun et al for the purpose of maintaining 
memory requirements and matching a search key. 
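
For illustration only, the claim 14 limitation (each search node comprising an instruction and an address field, with the address field determining the next search node based on the comparison) might be sketched as below. This is a minimal Python sketch and not the node format of Benayoun et al or of the application. The Node layout, the END sentinel, the MEMORY contents, and the rule that a failed comparison ends the search are all assumptions.

```python
from dataclasses import dataclass
from typing import List

END = -1   # hypothetical sentinel address: no further search node


@dataclass
class Node:
    instruction: int   # value compared with a portion of the search object
    address: int       # address of the next search node when the compare succeeds


# Hypothetical flat node memory holding the single entry 0x45 0x00 0x06.
MEMORY: List[Node] = [
    Node(0x45, 1),
    Node(0x00, 2),
    Node(0x06, END),
]


def search(key: bytes) -> bool:
    """Return True if every portion of the key matches along the path."""
    addr = 0
    for portion in key:
        if addr == END:
            return False               # key is longer than the stored path
        node = MEMORY[addr]            # read the node at the current address
        if portion != node.instruction:
            return False               # compare failed (assumed to end the search)
        addr = node.address            # address field selects the next node
    return addr == END                 # succeed only if the path was completed
```

Here the comparison result decides whether the address field is followed at all; a design that encodes separate match and no-match addresses would be an equally plausible reading.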
Regarding claim 23: 
Bennett teaches, 

- An apparatus for determining whether a search object matches any entry in a 
knowledge base (Abstract), wherein the knowledge base comprises a plurality of links 
connecting adjacent search nodes (Fig. 2; column 6, lines 45-63) 

However, Bennett doesn't explicitly teach the knowledge base comprises a decision 
tree structure or first and second processors while Bialkowski et al teaches, 

- An apparatus (Abstract; column 1, lines 62-67) wherein the knowledge base
comprises a decision tree structure comprising a plurality of links connecting adjacent
search nodes (column 3, lines 40-52), said method comprising

- a first memory storing a first portion of the decision tree structure (Fig. 1; column 2,
lines 9-22)

- a second memory storing a second portion of the decision tree structure (Fig. 1;
column 6, lines 22-41)

Benayoun et al teaches, 

- a first processor (Fig. 1, item 30)

- a second processor (Fig. 1, item 32)

- wherein said first processor accesses said first memory, and wherein said second
processor accesses said second memory for determining the search node that matches
at least a portion of said search object (column 2, lines 20-46; column 9, lines 45-46)

Motivation - The portions of the claimed apparatus/method would have been a highly
desirable feature in this art for maintaining memory requirements and other hardware
needs at a minimum (Bialkowski et al, column 1, lines 32-39) and searching for the tree
leaf matching a search key (Benayoun et al, Abstract). Therefore, it would have been 
obvious to one of ordinary skill in the art at the time the invention was made, to modify 
Bennett as taught by Bialkowski et al and Benayoun et al for the purpose of maintaining 
memory requirements and matching a search key. 

Regarding claim 24: 

The rejection of claim 24 is the same as that for claims 23 and 1 as recited above since 
the stated limitations of the claim are set forth in the references. 
Regarding claim 25: 

The rejection of claim 25 is similar to that for claim 23 as recited above since the stated 
limitations of the claim are set forth in the references. The additional limitation of 
claim 25 is taught in Benayoun et al: 

- the first processor and the second processor simultaneously execute tree searches for 
a plurality of search trees (column 8, lines 42-47) 

Conclusion 

The following prior art made of record is considered pertinent to applicant's 
disclosure: 

- van der Wal et al; US 5963675 A; Pipelined pyramid processor for image processing 
systems 

- Nagral et al; US 6260044 B1; Information storage and retrieval system for storing and 
retrieving the visual form of information from an application in a database 

- Srivastava et al; US 6563952 B1; Method and apparatus for classification of high 
dimensional data 

- Tzeng; US 6061712 A; Method for IP routing table look-up 

- Singh et al; US 5983224 A; Method and apparatus for reducing the computational 
requirements of K-means data clustering 

- Lomet; US 4611272 A; Key-accessed file organization 

- Wu et al; US 6381607 B1; System of organizing catalog data for searching and 
retrieval 

- Israni et al; US 5968109 A; System and method for use and storage of geographic 
data on physical media 

- Powers et al; US 5404513 A; Method for building a database with multi-dimensional 
search tree nodes 

- Simonetti; US 5295261 A; Hybrid database structure linking navigational fields having 
a hierarchial database structure to informational fields having a relational database 
structure 

- Zellweger; US 5630125 A; Method and apparatus for information management using 
an open hierarchical data structure 

- Marquis; US 5930805 A; Storage and retrieval of ordered sets of keys in a compact 
0-complete tree 

- Demuynck et al; Bmad-tree: an efficient data structure for parallel processing; Eighth 
IEEE Symposium on Parallel and Distributed Processing; 23-26 Oct. 1996; pp 384-391 

Any inquiry concerning this communication or earlier communications from the 
Office should be directed to Meltin Bell whose telephone number is 571-272-3680. This 
Examiner can normally be reached on Mon - Fri 7:30 am - 4:00 pm. 

If attempts to reach this Examiner by telephone are unsuccessful, his supervisor, 
Anthony Knight, can be reached on 571-272-3687. The fax phone number for the 
organization where this application or proceeding is assigned is (703) 872-9306. 

Any inquiry of a general nature or relating to the status of this application or 
proceeding should be directed to the receptionist whose telephone number is 
571-272-2100. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

Abstract 

B-trees are used for accessing large database files, 
stored in lexicographic order on secondary storage 
devices. Algorithms for concurrent B-tree data structures 
achieve only limited speedup when implemented 
on a parallel computer. To improve the performance, 
we propose a variant of the B^link-tree, called the 
B^mad-tree, which allows insertion without node splits, with 
multiple access in its leaf nodes, and dilation in both the 
index and the leaf nodes. Parallel algorithms for search, 
insert and restructuring are designed for partitioned, 
locked and distributed models. Only part of an insertion 
node is locked during the insert, and simultaneous 
insertions by multiple processors in the same node are 
allowed. A restructuring algorithm runs periodically 
in the background and requires at most one wait by 
any search or update operation. Our implementations 
demonstrate that the B^mad-tree algorithms outperform 
the best known B^link-trees, and compare favorably with 
linear hashing. We achieve good speedup (e.g., 4.79 
with 8 processors) for partitioned algorithms, and moderate 
speedup (2.49 with 8 processors) for locked algorithms, 
even including overhead costs. The insert 
times obtained for B^mad-trees are 50% to 60% less than 
those for the B^link-trees in partitioned implementations, 
and 70% to 80% less in locked implementations. The 
speedup results on the distributed memory platform (a 
network of workstations) were not that encouraging due 
to high communication costs. 

1. Introduction 

B-trees are widely used as an access method for large 
ordered files stored on secondary storage devices [1], because 
they provide fast access and easy maintenance. 
A B-tree of order m has the following properties: (i) 
every node has at most m and at least ⌈m/2⌉ children, 
(ii) the root has at least two children, unless it is the 
only node, (iii) all leaf nodes appear at the same level 
of the tree, (iv) an internal node with k children 
contains k - 1 key values. This definition guarantees that 
a B-tree is at least half full at all times. There are several 
variants such as B+-trees, B^link-trees and 2-3 trees 
[4, 13]. The B+-tree is a B-tree in which each node is 
at least 2/3 full. In the B^link-tree, all keys reside in the 
leaves (the sequence set) and internal nodes (the index 
set) are used as an index to the sequence set. The 2-3 
tree is a B-tree of order m = 3. For sequential algorithms 
on manipulating various B-trees, readers may 
refer to [1, 4, 13]. 

* This work is supported by Texas Advanced Technology Program 
under Award No. TATP-003594-31. 

Several lock-based concurrent algorithms for the 
standard B-trees and B+-trees are reported in [2, 11, 
14, 25], and for the 2-3 trees in [7]. Mond and Raz 
[20] used preparatory operations, where possible node 
splits and merges are detected and executed during the 
search phase. Mohan [19] suggested locks on individual 
key values instead of entire nodes. Many lock-based 
concurrent algorithms also exist for the B^link-tree 
[5, 9, 15, 24], which is a B+-tree variant where a 
pointer is added to each node linking it with its right 
neighbor. Other methods for achieving good performance 
include the use of cache memory [27], synchronization 
[10, 22], and retry strategies [17]. Distributed 
solutions due to Pramanik and Kim [23], and Matsliach 
and Shmueli [18], distribute the B-tree among 
the disks, require synchronized disk access mechanisms, 
and allow only a single operation at a time. 

Although many concurrent algorithms have been 
proposed for B-trees and their variants, only a few deal 
with actual performance studies on real multiprocessors. 
Most of the existing experimental studies are simulations 
of concurrent B-trees on parallel architectures. 
Mukkamala and Shultz [21] measured the performance 
of the B^link-tree on two simulated architectures, one 
with shared output devices, and the other with one output 
device for each processor. Ford et al. [8] performed 
a simulation study of the concurrent B-trees due to [2] 
and [15] for a central file server application. The simulation 
study by Srinivasan and Carey [26] is for the 
same algorithms used in [8] and the B-tree algorithm 
due to [14]. Colbrook et al. [3] implemented their message 
passing algorithms on a simulator. Johnson and 
Shasha [12] proposed an analytical performance model 
for concurrent B-tree algorithms, and used simulation 
to substantiate their results. B^link-trees show the best 
results in all these studies. 

Our earlier experiments [6] on concurrent B^link-trees 
on real parallel machines showed that a substantial 
amount of time was directly due to node splits. As the 
number of processors increases, the split time in B^link-trees 
may require more than 60% of insert time. We 
also observed a very large overhead in some algorithms 
due to locking. These results motivated us to develop 
a data structure in which splits are not necessary and 
restructuring is postponed, while nearly preserving 
sequential ordering of the data. Thus evolved the concept 
of the B^mad-tree, which stands for B^link-tree with multiple 
access and dilation. This new data structure is 
found to be more suited for efficient parallel processing. 
It allows insertion without node splits, access by 
multiple processors in its leaf nodes, and dilation in 
both the index and the leaf nodes. 

In this paper, we design and implement algorithms 
for construction, search, insert, and restructure of 
B^mad-trees for each of the three models - partitioned, 
locked, and distributed. The partitioned and locked 
algorithms are implemented on the Sequent Symmetry 
S/81 shared memory multiprocessor, and the distributed 
algorithms are implemented on a network of 
processors running the parallel virtual machine (PVM) 
software. Our experimental results compare favorably 
with those from B^link-tree and linear hashing implementations 
[6]. We achieve good speedup for partitioned 
algorithms (for example, a speedup of 4.79 with 
8 processors), and moderate speedup for locked algorithms 
(speedup 2.49 with 8 processors), even including 
overhead costs. Efficiency is as high as 80% excluding 
overhead costs, and as high as 60% when overhead 
costs are included. The insert times obtained for B^mad-trees 
are 50% to 60% less than those for the B^link-tree 
in partitioned implementations, and 70% to 80% less 
in locked algorithms. The results on the distributed 
memory platform were not that encouraging, due to 
high startup and communication costs which prevented 
good speedup. 

The paper is organized as follows. Section 2 introduces 
the B^mad-trees, and Section 3 describes parallel 
algorithms for partitioned and locked models. Section 
4 presents the implementation details and discusses the 
experimental results. Conclusions and future research 
are given in Section 5. 

2. The B^mad-tree Data Structure 

The motivation behind the B^mad-trees originated 
from our experimental research on the concurrent 
B^link-trees and linear hashing on the Sequent Multiprocessor 
system [6]. The goal is to minimize very 
expensive lock and node split time, and allow multiple 
processors to access the same node during updates. 
The B^mad-tree does not require node split or merge, 
and restructuring is postponed, while nearly preserving 
data ordering. 

Physically, a B^mad-tree is a B^link-tree where each 
leaf node is organized as a hash table with b buckets. 
The tree has the key values in semi-sorted order in the 
leaf level such that for any two leaf nodes, $L_i$ and $L_j$, 
where $L_i$ is to the left of $L_j$, all keys in $L_i$ are smaller 
than those in $L_j$. In an initial version of this data structure, 
the leaf nodes were organized into buckets, similar 
to those used in bounded disorder files [16]. However, 
that layout is not optimal in terms of space utilization. An improved 
version of the B^mad-tree optimizes leaf node 
space as well. The available key storage space is organized 
as a single array of multiple linked lists, one 
for each bucket. The variable, freelist, heads the list of 
available space. Each node has a header record, which 
points to the top of each bucket's list and counts the 
keys in the bucket. The field, totalcount, counts the 
keys stored in the node. The field, linkf, is the link 
pointer to the next bucket node to the right. When a 
node is full, the next insert goes to an overflow node, 
linked to the current node via an overflow pointer, ofptr. 
Overflow nodes have exactly the same structure as 
any other leaf node, but the initial one is the primary 
node. A restructure algorithm restores the tree to a 
well-balanced form. 
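
The leaf layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name `LeafNode`, the fields `head`, `count` and `next_slot`, the name `link` for the right-neighbor pointer, and the modulo hash are all hypothetical choices; only `freelist`, `totalcount` and `ofptr` are names taken from the text.

```python
from typing import List, Optional

NIL = -1  # marks the end of a linked list inside the slot array


class LeafNode:
    """Leaf node whose key space is one array shared by several bucket lists."""

    def __init__(self, capacity: int, buckets: int) -> None:
        self.keys: List[Optional[int]] = [None] * capacity
        # next_slot[i] links slot i to the next slot of the same list.
        self.next_slot: List[int] = [i + 1 for i in range(capacity)]
        if capacity:
            self.next_slot[-1] = NIL
        self.freelist: int = 0 if capacity else NIL  # head of the free slots
        # Header record: per-bucket list head and per-bucket key count.
        self.head: List[int] = [NIL] * buckets
        self.count: List[int] = [0] * buckets
        self.totalcount: int = 0
        self.link: Optional["LeafNode"] = None   # right neighbor (assumed name)
        self.ofptr: Optional["LeafNode"] = None  # overflow node, if any

    def bucket_of(self, key: int) -> int:
        return key % len(self.head)  # assumed hash function

    def is_full(self) -> bool:
        return self.freelist == NIL

    def insert(self, key: int) -> bool:
        """Put key at the top of its bucket list; return False if no slot is free."""
        if self.is_full():
            return False
        slot = self.freelist
        self.freelist = self.next_slot[slot]
        b = self.bucket_of(key)
        self.keys[slot] = key
        self.next_slot[slot] = self.head[b]
        self.head[b] = slot
        self.count[b] += 1
        self.totalcount += 1
        return True

    def bucket_keys(self, b: int) -> List[Optional[int]]:
        """Walk one bucket's list from its head."""
        out: List[Optional[int]] = []
        slot = self.head[b]
        while slot != NIL:
            out.append(self.keys[slot])
            slot = self.next_slot[slot]
        return out
```

Because every bucket draws slots from the same free list, one bucket can take all the space in the node while the others stay empty, which is the space property the text attributes to the improved layout.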

The modified leaf node structure eliminates wasted 
space. In this B^mad-tree, it does not matter if one 
bucket is large while another remains nearly empty. In 
the unlikely event that all the keys hash to the same 
bucket, performance is the same as any other B+-tree. 
The use of many small buckets rather than a few larger 
buckets keeps search times short and allows greater 
concurrency, although this requires more overhead in 
the form of a larger header record. 

3. Parallel Algorithms for the B^mad-tree 

This section contains an informal description of parallel 
B^mad-tree algorithms for construction, search, insert 
and restructure. For formal details, refer to [6]. 

3.1. Partitioned Algorithms 

Most concurrent algorithms in the literature for B- 
trees use either locks or processor synchronization for 
concurrency control. In partitioned algorithms, intro- 
duced in [6], however, no locks are used and processors 
work independently from each other on a partition of 
the data structure. 

The construction algorithm builds the tree from the 
bottom up, one leaf node at a time. The algorithm 
expects the keys to be in sorted order. New bucket 
nodes are generated and the index is constructed as 
needed, as long as there are keys in the file. Once 
the tree is built, the function balance_tree checks the 
rightmost node at each level and shifts values from its 
left neighbor when an underfull node is found. 

All searches start at the root node. As each index 
node is reached, its obsolete flag is checked. If it is set, 
the node is no longer valid and the search continues 
via the link pointer to the right neighbor. At the leaf 
node level, a search key is hashed to find its bucket. If 
the search key is not found and if there is an overflow 
node, the search resumes in the overflow node, until 
either the search key or the end of the chain is found. 
Since it is the task of the search algorithm to check the 
overflow chain and return the correct insertion location, 
an insert operation only adds new values to the node. 
No data are shifted, no nodes are split or merged, and 
no entries are added to the index. If there is room in 
the node, the new key is inserted at the top of the list in 
the target bucket. In the worst case the current bucket 
node is full, and a new overflow node is created before 
insertion takes place. The new node is then linked to 
the previous one by an overflow pointer. 
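
The leaf-level search and insert steps can be sketched as below. This is a simplified model under stated assumptions: buckets are plain Python lists rather than linked lists in a shared array, the hash is a modulo, and the names `Leaf`, `search` and `insert` are hypothetical. Index traversal and obsolete flags are left out.

```python
from typing import List, Optional, Tuple


class Leaf:
    """Simplified leaf node: a few buckets plus an optional overflow node."""

    def __init__(self, capacity: int, buckets: int) -> None:
        self.capacity = capacity
        self.buckets: List[List[int]] = [[] for _ in range(buckets)]
        self.totalcount = 0
        self.ofptr: Optional["Leaf"] = None

    def is_full(self) -> bool:
        return self.totalcount >= self.capacity


def search(primary: Leaf, key: int) -> Tuple[bool, Leaf]:
    """Return (found, node).

    If found, node holds the key. Otherwise node is the end of the
    overflow chain, i.e. the location an insert should use.
    """
    node = primary
    while True:
        b = key % len(node.buckets)  # assumed hash function
        if key in node.buckets[b]:
            return True, node
        if node.ofptr is None:
            return False, node
        node = node.ofptr


def insert(primary: Leaf, key: int) -> bool:
    """Insert key; no data are shifted and the index is never touched."""
    found, node = search(primary, key)
    if found:
        return False
    if node.is_full():
        # Worst case: create an overflow node and link it to the chain.
        overflow = Leaf(node.capacity, len(node.buckets))
        node.ofptr = overflow
        node = overflow
    b = key % len(node.buckets)
    node.buckets[b].insert(0, key)  # top of the target bucket's list
    node.totalcount += 1
    return True
```

In this model a node only gains an overflow node once it is full, so the end of the chain is the only place that can still have room, which is why `search` can hand back the insertion location directly.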

The restructure algorithm is triggered when the tree 
has too many overflow nodes. This condition can be 
detected (i) either on a continuous basis as the tree 
accommodates update transactions, (ii) or by a sepa- 
rate function so that each processor keeps track of the 
number of overflow nodes it creates, and those counters 
are periodically checked by an idle processor. The first 
method works very well for sequential and partitioned 
algorithms, since there is no contention to update an 
overflow node counter. The second method is better 
for other types of concurrent algorithms. Levels are re- 
structured one at a time, from left to right, by various 
rebuilding functions, one for the leaf nodes, one for the 
internal index nodes, and another for the root node. 

A restructuring processor checks each leaf node and 
examines the parent node to ensure it is the correct one. 
If not, the link pointer is followed until the correct one 
is found. Each time a parent is first entered, it builds 
a new parent node, ptemp, and marks the old parent 
node obsolete. Each primary leaf node is examined in 
turn. If it has no overflow chain, it remains as is, and its 
maxvalue and pointer are copied to ptemp. Otherwise, 
assume there are q > 1 nodes in the chain. Keys in 
the primary and the overflow nodes are sorted and q 
new nodes are built and linked, the leftmost to the left 
neighbor and the rightmost one to the right neighbor. 
If there are enough new nodes, ptemp may fill up. A 
single index node then dilates into multiple nodes as 
shown in Figure 1. When a new parent node is started, 
the previous one is closed. If the last dilation node is 
underfull, enough key/pointer pairs are shifted from 
its left neighbor to keep the tree balanced. A leaf node 
rebuilding process of a chain with 3 nodes, and the 
corresponding changes in the deepest index level, are 
shown in Figure 2. Affected nodes are shown shaded in 
Figures 1 and 2, and all old pointers are shown dashed. 
An x is used to indicate an obsolete flag is set. 




Figure 1. Temporarily expanded index node 
(legend: obsolete node, new index nodes, old link pointer) 

Figure 2. Restructuring of leaf nodes 

Index levels are restructured next. Two levels are 
affected at a time, namely a lower level where obso- 
lete nodes are removed, and its parent level where new 
nodes are built. In the former, nodes marked as obso- 
lete are deleted and pointers to the new index nodes 
and their maxkeys are inserted in the new parent nodes. 
An index node with two dilations is shown in Figure 3. 
The root node is rebuilt as follows. If the level below 
the root node has fewer than m nodes, the old root is 
replaced by a single new one. If there are more than m 
nodes however, the tree expands and gets a new level 
and a new root node. 

The restructure algorithm requires many duplicate 
nodes. In fact, an index level expands to at least twice 
its former size. The algorithm is improved by replacing 
obsolete parents immediately in those instances where 
there is only one dilation node. This saves space and 
shortens the time it takes to restructure the next level. 



Figure 3. Newly rebuilt index level and the 
changes to its parent nodes 

3.2. Parallel Algorithms with Locks 

The search function for the locked algorithms proceeds 
as in the partitioned algorithms. Four cases arise: 

Case 1: No restructuring is in progress in the tree. 
The other cases occur due to tree reconstruction. 

Case 2: The search reaches an obsolete node whose 
replacement node(s) is (are) complete. 

Case 3: The replacement node(s) is (are) not finished. 

Case 4: A leaf node is restructuring. 

If no restructuring is in progress, all checks for ob- 
solete nodes are negative. Thus Case 1 covers the ma- 
jority of all search operations. To prevent premature 
access to a not completely redone or to a deleted node, 
each index node has a flag, linkflag, which signals that 
the replacement list is complete. Each search process 
first checks a node's obsolete flag. If the flag is turned 
on, it checks linkflag. If it is set, the replacement nodes 
are ready to be used (Case 2) and the search continues. 
Otherwise the search waits until the list is ready (Case 
3). In Case 4, the search process has reached the last 
index level and gives access to the target node. Delay is 
possible in at most one index node in the entire tree, be- 
cause restructuring proceeds one level at a time. When 
an obsolete node is encountered, the search temporar- 
ily continues to the right instead of downwards through 
link pointers. 
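
The flag checks for Cases 1 to 3 can be summarized in a short sketch. It assumes a minimal index node with only the two flags and a link pointer; `IndexNode`, `next_step` and `resolve` are hypothetical names, and the waiting in Case 3 is represented by a return value rather than an actual busy-wait.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexNode:
    obsolete: bool = False      # set when the node has been replaced
    linkflag: bool = False      # set when the replacement list is complete
    link: Optional["IndexNode"] = None  # pointer to the right


def next_step(node: IndexNode) -> str:
    """Decide what a search does on reaching an index node."""
    if not node.obsolete:
        return "descend"        # Case 1: node is valid
    if node.linkflag:
        return "follow_link"    # Case 2: replacements are ready
    return "wait"               # Case 3: replacements not finished


def resolve(node: IndexNode) -> Optional[IndexNode]:
    """Follow link pointers to a valid node; None means the search must wait."""
    while node is not None:
        step = next_step(node)
        if step == "descend":
            return node
        if step == "wait":
            return None
        node = node.link
    return None
```

A real search would retry after a `None` result; since restructuring proceeds one level at a time, the text bounds that delay to at most one index node per search.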

As the insert algorithm does not alter the index set 
of the tree, no locks are needed until the key is inserted 
in the node. There are three possibilities: 

(i) The insert node is not full. A lock on the bucket, its 
header and the head of the free list is obtained. The 
node is rechecked to prevent reinsertion of the same 
record. The node variables are updated, the lock is 
released, and the new record is added to the bucket. 

(ii) If the insert node is full, the overflow pointer is 
checked. If an overflow node already exists, the current 
pointer is advanced, and insert completes as in (i). 

(iii) The current node is full but a new overflow node 
does not exist yet. A lock is requested on the overflow 
pointer. Once granted, the pointer is rechecked. If it 
is no longer nil, the lock is released and insert reverts 
to (ii). Otherwise, a new node is created and insertion 
completes. The new node is attached to the previous 
and the lock is released. The processor that creates a 
new overflow node increments its overflow counter. 
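
A rough sketch of the three insert cases, using Python threads in place of processors. Several simplifications are assumed and are not from the paper: one lock per node stands in for the finer bucket/header/free-list lock, the new overflow node is attached empty and then filled through the normal path, the per-processor overflow counter is omitted, and all names (`LockedLeaf`, `locked_insert`, `chain_keys`) are hypothetical.

```python
import threading
from typing import List, Optional


class LockedLeaf:
    def __init__(self, capacity: int, buckets: int) -> None:
        self.capacity = capacity
        self.buckets: List[List[int]] = [[] for _ in range(buckets)]
        self.totalcount = 0
        self.ofptr: Optional["LockedLeaf"] = None
        self.node_lock = threading.Lock()  # guards buckets and counters
        self.of_lock = threading.Lock()    # guards the overflow pointer only


def locked_insert(primary: LockedLeaf, key: int) -> bool:
    node = primary
    while True:
        with node.node_lock:
            b = key % len(node.buckets)       # assumed hash function
            if key in node.buckets[b]:        # recheck: no reinsertion
                return False
            if node.totalcount < node.capacity:   # case (i)
                node.buckets[b].insert(0, key)
                node.totalcount += 1
                return True
        if node.ofptr is None:                # case (iii)
            with node.of_lock:
                if node.ofptr is None:        # recheck under the lock
                    node.ofptr = LockedLeaf(node.capacity, len(node.buckets))
        node = node.ofptr                     # case (ii): advance and retry


def chain_keys(primary: LockedLeaf) -> List[int]:
    """Collect all keys in a primary node and its overflow chain."""
    out: List[int] = []
    node: Optional[LockedLeaf] = primary
    while node is not None:
        for bucket in node.buckets:
            out.extend(bucket)
        node = node.ofptr
    return out
```

The second check of `ofptr` inside `of_lock` mirrors the recheck in case (iii): two inserters that both see a nil pointer cannot both attach an overflow node.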

In the locked restructure algorithm, an idle processor 
checks the restructure control variable, restr_flag, 
which can have a value of 0, 1, or 2. If the flag is 0, no 
restructure is in progress, while 1 implies that restructure 
is currently in progress. A completed restructure is 
indicated by flag value 2. When the flag is 0, the processor 
counts the total number of overflow nodes and starts 
restructuring if this number exceeds the limit. Initially 
the current processor activates the restructure driver, 
which controls two important events. First, only one 
processor at a time is allowed to restructure, which is 
controlled by a busy flag. The second function of the 
driver is to know which is the next level to restructure. 
This is controlled by a variable rstatus. At 
the leaf level, a node without overflow does not change 
and therefore need not be locked. The insertion of its 
key/pointer pair in the temporary parent node does 
not require locks either because only the restructuring 
processor has access to a dilation node at this time. 
When the replacement list is complete, the link flag in 
the old node is set. This also does not require a lock, 
since again only the restructuring processor is allowed 
to change that field and the obsolete flag. At the leaf 
level, nodes with chain lengths > 1 are locked. 

3.3. Speedup Analysis 

Given a logical B^mad-tree of order m and of n key 
values, the total number of leaf nodes is v. The 
sequential time for a single search/insert operation is 
$T_s = O(\log_m v)$. The performance of the partitioned 
B^mad-tree algorithms is obtained as follows. Each 
partition has approximately $v/p$ leaf nodes, where p is 
the number of available processors. The height of 
the tree associated with each partition is $O(\log_m (v/p))$. 
Thus within a partition, a single update operation such 
as search and insert requires $T_p = O(\log_m (v/p))$ time. 
Therefore, the speedup for a single search and insert 
is $S_p = T_s / T_p = O(\log_m v / \log_m (v/p))$. The best case time for p 
simultaneous operations, where each processor or partition 
handles one operation, is also $T_p = O(\log_m (v/p))$, 
and the speedup for p operations is $S_p = O(p \log_m v / \log_m (v/p))$. 
However, in the extreme worst case, all p operations may 
be performed in the same partition, and the parallel 
time would be $T_p = O(p \log_m (v/p))$. By careful pipelining, 
one may be able to reduce this time complexity. 
For the construction algorithm, sequential time 
is $O(v \log_m v)$. The tree creation time for each partition 
is $T_p = O((v/p) \log_m (v/p))$, and hence the speedup is 
$S_p = O(p \log_m v / \log_m (v/p))$. 

Let $L_s$ be the time spent waiting for locks by the 
search process. Then the search time required in the 
lock-based algorithm is $O(\log_m v + L_s)$. If $L_i$ represents 
the lock time spent by insert, then the traditional lock-based 
insert time is $O(\log_m v + L_i)$. The speedup for 
a single search operation is $S_p = O(\log_m v / (\log_m v + L_s))$. At 
most p operations take place at any one time, therefore 
$S_p = O(p \log_m v / (\log_m v + L_s))$. A similar analysis holds for insert. 
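
To get a feel for these bounds, the expressions can be evaluated with the hidden constants dropped. This is only a rough numerical illustration: the function names are hypothetical, the lock wait is expressed in the same unit as one tree level, and v/p must be greater than 1 for the logarithm to be positive.

```python
import math


def log_m(x: float, m: float) -> float:
    """Logarithm of x to base m."""
    return math.log(x) / math.log(m)


def single_op_speedup(v: float, p: int, m: int) -> float:
    """log_m(v) / log_m(v/p): one operation inside one partition."""
    return log_m(v, m) / log_m(v / p, m)


def batch_speedup_best(v: float, p: int, m: int) -> float:
    """p operations, one per partition (best case)."""
    return p * single_op_speedup(v, p, m)


def locked_speedup(v: float, p: int, m: int, lock_wait: float) -> float:
    """p * log_m(v) / (log_m(v) + lock_wait) for the lock-based model."""
    depth = log_m(v, m)
    return p * depth / (depth + lock_wait)


if __name__ == "__main__":
    v, p, m = 4096, 8, 8
    print(single_op_speedup(v, p, m))    # about 4/3
    print(batch_speedup_best(v, p, m))   # about 32/3
    print(locked_speedup(v, p, m, 4.0))  # about 4.0
```

With v = 4096 leaves, m = 8 and p = 8, the tree has 4 levels and each partition has 3, so partitioning shortens a single operation only slightly; the gain comes mostly from running p operations at once.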

4. Performance Evaluation 

The B^mad-tree algorithms were implemented for 
locked, partitioned and distributed approaches. We 
focus on the search and insert operations, as well as 
on the construction and restructure. As expected, 
the B^mad-trees showed better performance than other 
B^link-trees, especially for locked implementations. 

4.1. Implementation Environments 

The partitioned and locked algorithms were imple- 
mented on a Sequent Symmetry S/81 shared memory 
system with 16 processors, running under DYNIX O.S., 
a version of Unix 4.2bsd. The Sequent is a tightly cou- 
pled multiprocessor with 32-bit microprocessors and a 
shared bus. All the processors are identical. The pro- 
grams were written in parallel C language. Measure- 
ments were taken using the Sequent's system clock, and 
are given in microseconds. 

Distributed algorithms were implemented on a net- 
work that included DEC Alpha workstations and Intel 
486 based PCs, running the Parallel Virtual Machine 
(PVM) version 3.3 software package. This software al- 
lows a network of heterogeneous computers to run as 
a single multicomputer system, called a virtual machine. 
It is a user defined collection of systems that 
can include serial, parallel and vector computers - all 
machines run under UNIX. Once the virtual machine 
is defined, PVM includes routines to automatically 
spawn tasks on the member computers, and allows 
them to communicate and synchronize with each 
other. 

4.2. Performance Measurements 

Relatively small trees are used as test data due to 
limited memory space available in our system. The 
B^mad-trees considered have index nodes of size m = 8, 
and leaf nodes of size 20. The initial trees were constructed 
with full index nodes. Unlike other B-trees, 
it is efficient to construct the B^mad-tree with full index 
nodes, since subsequent inserts have no immediate 
effect on the index. This yields a denser, shorter (in 
height), and therefore a more efficient tree. The following 
time measurements were taken as appropriate 
for each algorithm: 

• Total run time ($T_{total}$): It measures the total time 
to process all the data in the input stream. It 
includes $T_{insert}$, $T_{oh}$, and $T_{fork}$. $T_{lock}$ is included 
for lock-based algorithms, and $T_{comm}$ for 
distributed algorithms. 

• Insert time ($T_{insert}$): It measures the time for insertion, 
including the times $T_{lock}$ and $T_{of\_create}$. 

• Overhead time ($T_{oh}$): It measures the time needed 
to get the data ready, and the time required to 
access the input stream. 

• Process creation time ($T_{fork}$): It measures only 
the time needed to spawn the parallel processes. 

• Lock time ($T_{lock}$): It measures the time spent 
obtaining and releasing locks. The system functions 
to obtain and release locks are s_lock() and 
s_unlock(). It does not include the amount of time 
the lock is held while updating locked data. 

• Send time ($T_{send}$): This is the time required to 
send messages in a distributed environment. 

• Receive time ($T_{recv}$): It measures the time waiting 
for messages to arrive. 

• Communication time ($T_{comm}$): This measures the 
total time spent in communication overhead. It is 
likely that $T_{send}$ and $T_{recv}$ overlap at least partially. 
We could not, however, accurately measure 
the amount of overlap. Therefore, 
$T_{comm} = \max\{T_{send}, T_{recv}\}$. 

• Overflow creation time ($T_{of\_create}$): This measures 
the time to create new overflow nodes during insert 
operations. 

• Restructure time ($T_{restr}$): This measures the time 
required to restructure the tree. Note that this 
time is a background time and almost never interferes 
with other tree operations. 

4.3. Results for Search and Insert 

The performance of the partitioned algorithms for 
B^mad-trees is good, and gradually improves as the number 
of transactions grows. The best results were observed 
at 10,000 insertions, even though the tree has 
now tripled in size. Figure 4 shows the components 
of the total insert time, $T_{total}$, for an initial tree of 
5,000 keys with 10,000 subsequent inserts. This includes 
overhead time, fork time, search time and insert 
time. Both the pure insert and search operations show 
a steady increase in speedup as the number of processors 
(p) increases, with the higher speedup observed 
for the search operation. The speedup saturates beyond 
p = 10, due to the linear growth of $T_{fork}$ with 
p. However, the overhead time, $T_{oh}$, remains constant. 
Figure 5 shows a typical result for a smaller number 
of transactions. There is good speedup through p = 6, 
when the linear increase in $T_{fork}$ causes a decline in 
performance. 



Figure 4. $T_{total}$ (in microseconds) for B^mad-tree 
with 10,000 inserts 
(plot title: Bmad Partitioned Total Insert Performance; initial: 5,000; 10,000 inserts; 
series: Overhead, Fork, Search, Insert, Total) 



Figure 5. $T_{total}$ (in microseconds) for B^mad-tree 
with 5,000 inserts 
(plot title: Partitioned Bmad Total Insert Performance; initial: 5,000; 5,000 inserts; 
series: Overhead, Fork, Search, Insert, Total) 

One should not think that the insertion in the B^mad-tree 
yields a better performance at the cost of increased 
search time, especially as the tree grows and the number 
of overflow nodes increases. The average search 
time per key, computed as $T_{search}/n$, shows only a slight 
increase in the search time. The average insert time 
per key, $T_{insert}/n$, decreases as n grows larger. Thus the 
average total time per key decreases or remains 
constant. This trend is persistent, regardless of the 
number of processors in the system. A typical result is 
shown in Figure 6. 



Figure 6. Average time per key in B^mad-tree 
(plot title: Bmad-tree Partitioned: Average S/I Time per Key; initial tree: 5,000; 
10 processors; x-axis: number of inserts, 2,500 to 10,000) 

In the implementations for the distributed algo- 
rithms on a network of processors created with PVM, 
we were able to spawn many more processors than on 
the shared memory system. However, at about 18 to 
20 processors, the results lost much of their significance 
because partitions become very small. We present re- 
sults for up to p = 22 in which the dominant features 
are the spawn and communication costs. These results 
are consistent for all combinations of initial trees and 
subsequent transactions. Figure 7 shows speedup per- 
formance. It includes costs for total insert, spawning 
the processes and communication. When evaluating 
these results however, we should remember that the 
process creation costs are a one-time cost in nearly all 
instances. Once the initial virtual machine is created, 
that cost will not reoccur unless a processor fails. Sim- 
ilarly, if communication costs are concentrated at the 
startup, they diminish proportionately as the process- 
ing time and the number of transactions increases. 



Figure 7. $T_{total}$ for distributed B^mad-tree 
(plot title: Distributed Bmad-tree Total Cost; initial tree: 5,000; insert 10,000; 
series: Spawn, Communication, Insert, Total Cost) 

Figure 8 shows the results for the locked B^mad-trees, 
which show good speedup initially but a subsequent loss 
of performance due to the linear increase in fork time. 
We expected that the altered leaf node structure of 
the B^mad-tree and the elimination of index adjustments 
during insertions would reduce $T_{lock}$ costs, but results 
were even better than expected, and $T_{lock}$ is consistently 
the smallest component within the insert time. 

Table 1 gives speedup information for the various
B^mad-tree algorithms, where P, L and D respectively
denote 'partitioned', 'locked', and 'distributed'.
Efficiency is defined as the ratio of speedup to p, and
processor utilization is optimal when efficiency is O(1). In
our experiments, the partitioned algorithms achieve the
highest efficiency, up to 0.81 for p = 8 when no
overhead is included, and 0.60 when overhead is included.
The efficiency for the locked algorithms with p = 8 is
up to 0.48 (no overhead) and 0.31 (with overhead). For
the distributed algorithms with p = 8, efficiency is as
high as 0.85 (no overhead) and as low as 0.01 (with
overhead).
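
As a minimal sketch of the efficiency definition (speedup divided by p), the
following helper reproduces one of the quoted figures. The function name
efficiency is an assumption made for illustration; the sample speedup 6.48
is one of the p = 8 entries reported in Table 1.

```python
def efficiency(speedup, processors):
    """Efficiency as defined in the text: speedup divided by p."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# 6.48 is one of the p = 8 speedup entries reported in Table 1.
print(f"{efficiency(6.48, 8):.2f}")  # prints 0.81
```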

[Figure 8: "Bmad Traditional Lock Total Insert". Initial: 5,000; 5,000 Inserts. x-axis: Number of Processors. Series: Overhead, Fork, Search, Insert, Total.]

Figure 8. T_total for locked B^mad-tree

Table 1. Speedup for the B^mad-tree

                                     No Overhead           With Overhead
Processors   Transactions          P      L      D        P      L      D
8            5,000 keys         6.48   3.79   6.76     4.69   2.00   0.66
8            10,000 keys        6.57   3.85   6.94     4.79   2.49   1.22
12           5,000 keys         9.04   4.43  10.97     5.79   1.74   0.50
12           10,000 keys        8.68   4.83   9.19     5.75   2.53   1.07

4.4. Performance Results for Restructure

As expected, restructuring the tree is expensive. In
partitioned and distributed algorithms, T_restr shows
good speedup. At p = 2, T_restr is as much as T_total,
but good speedup reduces that cost significantly as p
increases. Similar but slightly better results were
obtained in the distributed algorithms.

In the locked solutions, the results are viewed in
two different ways. Like the others, the restructure costs
are measured sequentially. Since for locked algorithms
restructuring happens on the entire tree, not on a
partition, the cost is not distributed over all the
processors. Thus T_restr is large and remains essentially the
same, independent of the number of processors. This is
clearly a false result, since the restructure runs
simultaneously with other processing, or in the background.

4.5. Comparison With Others

Figure 9 compares the total performance (T_total) of
the B^mad-tree with other concurrent data structures,
including B^link-trees and linear hashing. The locked
B^link-tree remains the most expensive solution, but
distributed solutions are also expensive as the number
of processors in the virtual machine grows. The
locked B^mad-tree and linear hashing and all the
partitioned algorithms give the best results. The insert
times obtained for the B^mad-tree are 50% to 60% less than
those for the B^link-tree in partitioned implementations.
The results are even better for locked algorithms, where
the B^mad-tree implementations run in 70% to 80% less
time than those for the locked B^link-tree algorithms.
Figure 10 gives the results excluding overhead costs.
It shows continuing speedup for all algorithms, except
for locked linear hashing. Note that the data for
lock-based B^link-trees are not included, as they are an order
of magnitude higher than the other results obtained.

[Figure 9: "Compare All Algorithms - Total Cost". Initial: 5,000; Insert 5,000. Bmad (BM), Blink (BL), Linear Hashing (LH). Series: BM Part., BM Lock, BM Distr., BL Part., BL Lock, LH Part., LH Lock, LH Distr.]

Figure 9. Comparing T_total for all algorithms

5. Conclusions and Future Research 

This research started with a simple premise. Many
parallel and concurrent algorithms have been proposed
for B-trees, but very few were ever implemented on real
machines. Most of those were tested on simulated
multiprocessors, and hence very limited data on the
comparative performance of concurrent B-tree and hashing
algorithms are available.

[Figure 10: "Compare All Algorithms - Insert Only". Initial: 5,000; Insert 5,000. Bmad (BM), Blink (BL), Linear Hashing (LH). x-axis: Number of Processors. Series: BM Part., BM Lock, BM Distr., BL Part., LH Part., LH Lock, LH Distr.]

Figure 10. T_insert for all algorithms

An improved B^link-tree, called the B^mad-tree, is
proposed in this paper. It allows multiple updates in
the same node, reduces the number of locks required
during updates and the length of time locks are held,
and delays tree restructuring. The implementation
results show that the B^mad-tree algorithms perform
better than other existing B-tree implementations.
The locked B^mad-tree algorithms require only 20% to
30% of the time required by the locked B^link-trees, and
the partitioned algorithms require 40% to 50% of the
time required by the partitioned B^link-trees. The
B^mad-trees even outperformed linear hashing, a fast
hash table algorithm. The high cost of the restructure
algorithm does not affect the overall performance, as
it is intended to run in the background. Further
investigations into network and communication aspects are
needed to determine how to achieve better performance
in the distributed algorithms.
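
The percentages here and in Section 4.5 describe the same measurements in two
ways: "X% less time" versus "Y% of the time". A minimal sketch of the
conversion, using a hypothetical helper name and only the ranges quoted in
Section 4.5, might look like this:

```python
def percent_of_baseline_time(percent_less):
    """Convert 'runs in X% less time' into 'takes Y% of the baseline time'."""
    if not 0 <= percent_less <= 100:
        raise ValueError("percent_less must be between 0 and 100")
    return 100 - percent_less


# Reductions quoted in Section 4.5 for locked and partitioned algorithms.
for label, (low, high) in {"locked": (70, 80), "partitioned": (50, 60)}.items():
    print(label, percent_of_baseline_time(high), "to", percent_of_baseline_time(low))
```

For the locked case this gives 20 to 30, and for the partitioned case 40 to
50, matching the ranges stated in the conclusions.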

Our future research includes the development of
other algorithms for the B^mad-trees, such as traversal,
range search and non-trivial delete. The load balancing
issues in the proposed parallel algorithms will also
be studied experimentally.